Website Reconstruction using the Web Infrastructure

نویسنده

  • Frank McCown
چکیده

Backup or preservation of websites is often not considered until after a catastrophic event has occurred. In the face of complete website loss, webmasters or concerned third parties may be able to recover some of their website from the Internet Archive. Other pages may also be salvaged from commercial search engine (SE) caches if caught in time. We introduce the concept of “lazy preservation”digital preservation performed as a result of the normal operations of the Web infrastructure (search engines and caches). We will investigate methods of how websites can be automatically reconstructed from the Web infrastructure (WI) by using a web-repository crawler. We propose to evaluate and measure the effectiveness of various web-repository crawler strategies and evaluate various methods for injecting the generative functionality (e.g., CGI programs, databases, etc.) of websites into the WI. We also propose to develop methods for tracking resources as they move through the WI and to characterize SE caches using random sampling.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Reconstructing Websites for the Lazy Webmaster

Backup or preservation of websites is often not considered until after a catastrophic event has occurred. In the face of complete website loss, “lazy” webmasters or concerned third parties may be able to recover some of their website from the Internet Archive. Other pages may also be salvaged from commercial search engine caches. We introduce the concept of “lazy preservation”digital preservati...

متن کامل

Electronic Educational and Research Services as Infrastructure for the E- Government: Role of

Introduction: Websites serve as an initial step toward an e- government adoption which facilitates delivery of online services to customers. The existing study intended to investigate the role of university website to render educational and research services based on e- government maturity model in Iranian universities. Methods: This descriptive and cross- sectional study was conducted through...

متن کامل

Identification and Classification of Desirable Web-Based Services from the Perspective of Website Users of Iran’s Hospitals Based on Kano Model of Customer Satisfaction

Background and Aim: A hospital website is an appropriate system for exchanging information and connecting patients, hospitals and medical staff. The purpose of this study was to identify and classify desirable web-based services in websites of Iran's hospitals based on Kano’s Customer Satisfaction Model. Materials and Methods: This was a survey study. The statistical population of the study co...

متن کامل

Designing a System for Trend Analysis of Users in Website Surfing in Iran Using Data Mining and Text Mining Algorithms

Background and Aim: As of the entrance of web surfing to the lifestyle of a vast majority of people in the society and the need for a more accurate social and cultural policy making in the field, authors intended to analyze the behavior of the society users in viewing different websites so as to help politicians and practitioners. Methods: Design science research method is used in this research...

متن کامل

Georeferencing Semi-Structured Place-Based Web Resources Using Machine Learning

In recent years, the shared content on the web has had significant growth. A great part of these information are publicly available in the form of semi-strunctured data. Moreover, a significant amount of these information are related to place. Such types of information refer to a location on the earth, however, they do not contain any explicit coordinates. In this research, we tried to georefer...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2006